Learning Less is More - 6D Camera Localization via 3D Surface Regression
Popular research areas like autonomous driving and augmented reality have
renewed the interest in image-based camera localization. In this work, we
address the task of predicting the 6D camera pose from a single RGB image in a
given 3D environment. With the advent of neural networks, previous works have
either learned the entire camera localization process, or multiple components
of a camera localization pipeline. Our key contribution is to demonstrate and
explain that learning a single component of this pipeline is sufficient. This
component is a fully convolutional neural network for densely regressing
so-called scene coordinates, defining the correspondence between the input
image and the 3D scene space. The neural network is prepended to a new
end-to-end trainable pipeline. Our system is efficient, highly accurate, robust
in training, and exhibits outstanding generalization capabilities. It consistently exceeds the state of the art on indoor and outdoor datasets. Interestingly, our approach surpasses existing techniques even without utilizing a 3D model of the scene during training, since the network is able to discover 3D scene geometry automatically, solely from single-view constraints. Comment: CVPR 201
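The core idea can be sketched in a few lines. This is a toy illustration with hypothetical names and a simplified axis-aligned pinhole camera, not the paper's actual pipeline: a scene-coordinate network labels each pixel with a 3D point in scene space, and a camera hypothesis is then scored by the reprojection error of those dense 2D-3D matches.

```python
def project(point3d, cam_center, focal=1.0):
    # project a 3D scene point through a pinhole camera at cam_center,
    # axis-aligned and looking down +z (simplifying assumption)
    x, y, z = (p - c for p, c in zip(point3d, cam_center))
    return (focal * x / z, focal * y / z)

def reprojection_error(pixel, scene_coord, cam_center):
    # distance between the observed pixel and the projected scene coordinate
    u, v = project(scene_coord, cam_center)
    return ((pixel[0] - u) ** 2 + (pixel[1] - v) ** 2) ** 0.5

def inliers(matches, cam_center, thresh=0.01):
    # RANSAC-style consensus score of a camera hypothesis over dense matches
    return sum(1 for px, sc in matches
               if reprojection_error(px, sc, cam_center) < thresh)
```

In the actual system, a fully convolutional network predicts the scene coordinates and a robust PnP solver recovers the full 6D pose from such correspondences.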
PoseAgent: Budget-Constrained 6D Object Pose Estimation via Reinforcement Learning
State-of-the-art computer vision algorithms often achieve efficiency by
making discrete choices about which hypotheses to explore next. This allows
allocation of computational resources to promising candidates; however, such
decisions are non-differentiable. As a result, these algorithms are hard to
train in an end-to-end fashion. In this work we propose to learn an efficient
algorithm for the task of 6D object pose estimation. Our system optimizes the
parameters of an existing state-of-the-art pose estimation system using
reinforcement learning, where the pose estimation system now becomes the
stochastic policy, parametrized by a CNN. Additionally, we present an efficient
training algorithm that dramatically reduces computation time. We show
empirically that our learned pose estimation procedure makes better use of
limited resources and improves upon the state-of-the-art on a challenging
dataset. Our approach enables differentiable end-to-end training of complex
algorithmic pipelines and learns to make optimal use of a given computational
budget.
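The budget-constrained selection can be sketched as follows. This is a minimal stand-in with assumed names: a stochastic policy (here a fixed softmax over hypothesis scores; in the paper, a CNN trained with reinforcement learning) decides, at each step, which pose hypothesis receives the next unit of refinement.

```python
import math
import random

def softmax(scores):
    # turn hypothesis scores into a sampling distribution (the "policy")
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def spend_budget(scores, budget, rng=random.Random(0)):
    # each step stochastically picks one hypothesis to refine; because the
    # choice is sampled rather than an argmax, the expected reward can be
    # optimized with policy-gradient methods
    counts = [0] * len(scores)
    probs = softmax(scores)
    for _ in range(budget):
        i = rng.choices(range(len(scores)), weights=probs)[0]
        counts[i] += 1
    return counts
```

The sampled (rather than hard) choice is what makes end-to-end training with REINFORCE-style gradients possible.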
Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images
Analysis-by-synthesis has been a successful approach for many tasks in
computer vision, such as 6D pose estimation of an object in an RGB-D image
which is the topic of this work. The idea is to compare the observation with
the output of a forward process, such as a rendered image of the object of
interest in a particular pose. Due to occlusion or complicated sensor noise, it
can be difficult to perform this comparison in a meaningful way. We propose an
approach that "learns to compare", while taking these difficulties into
account. This is done by describing the posterior density of a particular
object pose with a convolutional neural network (CNN) that compares an observed
and rendered image. The network is trained with the maximum likelihood
paradigm. We observe empirically that the CNN does not specialize to the
geometry or appearance of specific objects, and it can be used with objects of
vastly different shapes and appearances, and in different backgrounds. Compared
to state-of-the-art, we demonstrate a significant improvement on two different
datasets which include a total of eleven objects, cluttered background, and
heavy occlusion. Comment: 16 pages, 8 figures
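The analysis-by-synthesis loop can be sketched as follows. This is a toy with a hand-crafted comparator (mean absolute pixel difference) standing in for the learned CNN; names and the candidate-set normalization are assumptions for illustration.

```python
import math

def comparison_energy(observed, rendered):
    # stand-in for the learned CNN comparator: mean absolute pixel
    # difference (the paper learns this comparison instead)
    return sum(abs(o - r) for o, r in zip(observed, rendered)) / len(observed)

def pose_posterior(observed, renderings):
    # posterior over candidate poses, p(pose | obs) proportional to exp(-E),
    # normalized over the rendered candidate set
    w = [math.exp(-comparison_energy(observed, r)) for r in renderings]
    z = sum(w)
    return [x / z for x in w]
```

Training the comparator under a maximum-likelihood objective, as in the paper, amounts to pushing probability mass toward renderings produced from the correct pose.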
Learning to Predict Dense Correspondences for 6D Pose Estimation
Object pose estimation is an important problem in computer vision with applications in robotics, augmented reality and many other areas. An established strategy for object pose estimation consists of, firstly, finding correspondences between the image and the object’s reference frame, and, secondly, estimating the pose from outlier-free correspondences using Random Sample Consensus (RANSAC). The first step, namely finding correspondences, is difficult because object appearance varies depending on perspective, lighting and many other factors. Traditionally, correspondences have been established using handcrafted methods like sparse feature pipelines.
In this thesis, we introduce a dense correspondence representation for objects, called object coordinates, which can be learned. By learning object coordinates, our pose estimation pipeline adapts to various aspects of the task at hand. It works well for diverse object types, from small objects to entire rooms, varying object attributes, like textured or texture-less objects, and different input modalities, like RGB-D or RGB images. The concept of object coordinates allows us to easily model and exploit uncertainty as part of the pipeline such that even repeating structures or areas with little texture can contribute to a good solution. Although we can train object coordinate predictors independently of the full pipeline and achieve good results, training the pipeline in an end-to-end fashion is desirable. It enables the object coordinate predictor to adapt its output to the specificities of following steps in the pose estimation pipeline. Unfortunately, the RANSAC component of the pipeline is non-differentiable, which prohibits end-to-end training. Adopting techniques from reinforcement learning, we introduce Differentiable Sample Consensus (DSAC), a formulation of RANSAC which allows us to train the pose estimation pipeline in an end-to-end fashion by minimizing the expectation of the final pose error.
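The DSAC idea described above can be sketched numerically. This is a minimal illustration, not the full pipeline: RANSAC's hard argmax over scored hypotheses is replaced by a softmax distribution, so the expected pose loss becomes a smooth function of the scores and gradients can flow through hypothesis selection.

```python
import math

def dsac_expected_loss(losses, scores, alpha=1.0):
    # softmax over hypothesis scores (temperature alpha, an assumed knob)
    # instead of RANSAC's non-differentiable argmax selection
    m = max(alpha * s for s in scores)
    exps = [math.exp(alpha * s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    # expectation of the final pose loss under the selection distribution;
    # end-to-end training minimizes exactly this kind of expectation
    return sum(p * l for p, l in zip(probs, losses))
```

As a hypothesis's score grows, the expected loss approaches the loss of that hypothesis alone, recovering hard selection in the limit.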
Two-View Geometry Scoring Without Correspondences
Camera pose estimation for two-view geometry traditionally relies on RANSAC.
Normally, a multitude of image correspondences leads to a pool of proposed
hypotheses, which are then scored to find a winning model. The inlier count is
generally regarded as a reliable indicator of "consensus". We examine this
scoring heuristic, and find that it favors disappointing models under certain
circumstances. As a remedy, we propose the Fundamental Scoring Network (FSNet),
which infers a score for a pair of overlapping images and any proposed
fundamental matrix. It does not rely on sparse correspondences, but rather
embodies a two-view geometry model through an epipolar attention mechanism that
predicts the pose error of the two images. FSNet can be incorporated into
traditional RANSAC loops. We evaluate FSNet on fundamental and essential matrix
estimation on indoor and outdoor datasets, and establish that FSNet can
successfully identify good poses for pairs of images with few or unreliable
correspondences. Moreover, we show that naively combining FSNet with the
MAGSAC++ scoring approach achieves state-of-the-art results.
CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus
We present a robust estimator for fitting multiple parametric models of the
same form to noisy measurements. Applications include finding multiple
vanishing points in man-made scenes, fitting planes to architectural imagery,
or estimating multiple rigid motions within the same sequence. In contrast to
previous works, which resorted to hand-crafted search strategies for multiple
model detection, we learn the search strategy from data. A neural network
conditioned on previously detected models guides a RANSAC estimator to
different subsets of all measurements, thereby finding model instances one
after another. We train our method supervised as well as self-supervised. For
supervised training of the search strategy, we contribute a new dataset for
vanishing point estimation. Leveraging this dataset, the proposed algorithm is
superior to other robust estimators as well as to designated
vanishing point estimation algorithms. For self-supervised learning of the
search, we evaluate the proposed algorithm on multi-homography estimation and
demonstrate an accuracy that is superior to state-of-the-art methods. Comment: CVPR 202
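The conditional sampling step can be sketched as follows. This toy uses a hand-crafted conditioning rule with assumed names; in CONSAC, a neural network predicts these sampling weights from the data and the previously detected models.

```python
import random

def conditional_minimal_set(n_points, explained, k, rng=random.Random(0)):
    # condition the sampling distribution on previously detected models by
    # down-weighting points they already explain, steering the next RANSAC
    # round toward a different model instance
    weights = [0.05 if explained[i] else 1.0 for i in range(n_points)]
    chosen = []
    for _ in range(k):
        i = rng.choices(range(n_points), weights=weights)[0]
        chosen.append(i)
        weights[i] = 0.0  # sample without replacement
    return chosen
```

Repeating this, updating `explained` after each fitted model, yields model instances one after another, as in the abstract.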
More Filtering on SNP Calling Does Not Remove Evidence of Inter-Nucleus Recombination in Dikaryotic Arbuscular Mycorrhizal Fungi
Evidence for the existence of dikaryote-like strains, low nuclear sequence diversity and inter-nuclear recombination in arbuscular mycorrhizal fungi has been recently reported based on single-nucleus sequencing data. Here, we aimed to support evidence of inter-nuclear recombination using an approach that filters SNP calls more conservatively, keeping only positions that are exclusively single copy and homozygous, and with at least five reads supporting a given SNP. This methodology recovers hundreds of putative inter-nucleus recombination events across publicly available sequence data from individual nuclei. Challenges related to the acquisition and analysis of sequence data from individual nuclei are highlighted and discussed, and ways to address these issues in future studies are presented.
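The conservative filter described in the abstract can be sketched as a predicate over SNP calls. Field names here are assumptions for illustration; the actual analysis operates on variant-caller output with its own schema.

```python
def filter_snp_calls(calls):
    # keep only positions that are single copy and homozygous, with at
    # least five reads supporting the called variant (per the abstract)
    return [c for c in calls
            if c["copy_number"] == 1
            and c["homozygous"]
            and c["supporting_reads"] >= 5]
```

Applying such a filter before comparing nuclei reduces the chance that paralogs, heterozygous sites, or low-coverage artifacts are mistaken for recombination signal.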